NER in Archival Finding Aids: Extended

Authors

Abstract

The amount of information preserved in Portuguese archives has increased over the years. These documents represent a national heritage of high importance, as they portray the country's history. Currently, most Portuguese archives have made their finding aids available to the public in digital format; however, these data do not have any annotation, so it is not always easy to analyze their content. In this work, Named Entity Recognition solutions were created that allow the identification and classification of several named entities from the archival finding aids. These named entities translate into crucial information about their context and, with high confidence results, they can be used for several purposes, for example, the creation of smart browsing tools by using entity linking and record linking techniques. In order to achieve high result scores, we annotated several corpora to train our own Machine Learning algorithms in this context domain. We also used different architectures, such as CNNs, LSTMs, and Maximum Entropy models. Finally, all the created datasets and ML models were made available to the public with a developed web platform, NER@DI.

Related resources

Access to Archival Finding Aids: Context Matters

We detail the design of a search engine for archival finding aids based on an XML database system. The resulting system shows results—which can vary in granularity from individual archival items to the whole fonds—within the context of the archive. The presentation preserves the archival structure by providing important contextual information, and all individual results can be “clicked”, warpin...

Searching Archival Finding Aids: Retrieval in Original Order?

Archival principles such as Provenance (keeping material from the same creator together) and its corollary Original Order (keeping the order of creation intact) could help improve access to archival materials. We investigate the importance of relevance ranking and ‘Original Order’ when searching finding aids in EAD using XML Retrieval. Our experiment shows that relevance ranking is of paramount ...

THE IMPACT OF COMPUTERIZATION ON ARCHIVAL FINDING AIDS: A RAMP STUDY prepared by

The impact of computerization on archival finding aids: a RAMP study / prepared by Christopher Kitching [for the] General Information Programme and UNISIST. PREFACE In order to aid Member States, particularly developing countries, to meet their needs in the specialized areas of Archives Administration and Records Management, the Division of the General Information Programme has developed a long...

Modeling Archival Repositories for Digital Libraries Extended

This paper studies the archival problem: how a digital library can preserve electronic documents over long periods of time. We analyze how an archival repository can fail and we present different strategies that help solve the problem. We introduce ArchSim, a simulation tool for evaluating an implementation of an archival repository system and compare options such as different disk reliabili...

Journal

Journal title: Machine Learning and Knowledge Extraction

Year: 2022

ISSN: 2504-4990

DOI: https://doi.org/10.3390/make4010003